Windows Subsystem for Linux Plan 9 Protocol Research Overview
This is the final blog in the McAfee research series trilogy on the Windows Subsystem for Linux (WSL) implementation – see The Twin Journey (part 1) and Knock, Knock–Who’s There (part 2). The previous research discussed file evasion attacks when the Microsoft P9 server can be hijacked with a malicious P9 (Plan 9 File System Protocol) server. Since Windows 10 version 1903, it is possible to access Linux files from Windows by using the P9 protocol. The Windows 10 operating system comes with the P9 server as part of the WSL install so that it can communicate with a Linux filesystem. In this research we explore the P9 protocol implementation within the Windows kernel and whether we could execute code in it from a malicious P9 server. We created a malicious P9 server by hijacking the Microsoft P9 server and replacing it with code we can control.
In a typical attack scenario, we discovered that if WSL is enabled on Windows 10, then a non-privileged local attacker can hijack the WSL P9 communication channel to cause a local Denial of Service (DoS) or Blue Screen of Death (BSOD) in the Windows kernel. It is not possible to achieve escalation of privilege (EoP) within the Windows kernel due to this vulnerability; the BSOD appears to be as designed by Microsoft within their legitimate fail flow, if malformed P9 server communication packets are received by the Windows kernel. A non-privileged user should not be able to BSOD the Windows kernel, from a local or remote perspective. If WSL is not enabled (disabled by default on Windows 10), the attack can still be executed but requires the attacker to be a privileged user to enable WSL as a pre-requisite.
There have recently been some critical, wormable protocol vulnerabilities within the RDP and SMB protocols in the form of Bluekeep and SMBGhost. Remotely exploitable vulnerabilities are very high risk if they are wormable as they can spread across systems without any user interaction. Local vulnerabilities are lower risk since an attacker must first have a presence on the system; in this case they must have a malicious P9 server executing. The P9 protocol implementation runs locally within the Windows kernel so the objective, as with most local vulnerability hunting, is to find a vulnerability that allows an escalation of privilege (EoP).
In this blog we do a deep dive into the protocol implementation and vulnerability hunting process. There is no risk to WSL users from this research, which has been shared with and validated by Microsoft. We hope this research will help improve understanding of the WSL P9 communications stack and that additional research would be more fruitful further up the stack.
There have been some exploits on WSL such as here and here but there appears to be no documented research of the P9 protocol implementation other than this.
P9 Protocol Overview
The Plan 9 File System Protocol server allows a client to navigate its file system to create, remove, read and write files. The client sends requests (T-messages) to the server and the server responds with R-messages. The P9 protocol has a header consisting of size, type and tag fields which is followed by a message type field depending on the request from the client. The R-message type sent by the server must match the T-message type initiated from the client. The maximum connection size for the data transfer is decided by the client during connection setup; in our analysis below, it is 0x10000 bytes.
P9 protocol header followed by message type union (we have only included the subset of P9 message types which are of interest for vulnerability research):
struct P9Packet {
    u32 size;
    u8 type;
    u16 tag;
    union {
        struct p9_rversion rversion;
        struct p9_rread rread;
        struct p9_rreaddir rreaddir;
        struct p9_rwalk rwalk;
    } u;
} P9Packet;
The P9 T-message and corresponding R-message numbers for the types we are interested in (the R-message is always T-message+1):
enum p9_msg_t {
    P9_TREADDIR = 40,
    P9_RREADDIR = 41,
    P9_TVERSION = 100,
    P9_RVERSION = 101,
    P9_TWALK = 110,
    P9_RWALK = 111,
    P9_TREAD = 116,
    P9_RREAD = 117,
};
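As a minimal sketch of the header layout and the T/R numbering described above, the following C fragment decodes the 7-byte header from a raw buffer and checks that a reply type equals the request type plus one. The names p9_header, p9_parse_header and p9_reply_matches are hypothetical and used only for illustration; they are not taken from p9rdr.sys. The sketch assumes the little-endian integer encoding that 9P uses on the wire.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed 9P header: size[4] type[1] tag[2], integers little-endian. */
#define P9_HEADER_SIZE 7u

/* Hypothetical decoded form of the header (illustration only). */
struct p9_header {
    uint32_t size; /* total packet length, header included */
    uint8_t  type; /* message type, e.g. 117 for P9_RREAD */
    uint16_t tag;  /* identifies the outstanding request */
};

/* Decode the header from raw bytes.
 * Returns 0 on success, -1 if the buffer cannot hold a full header. */
int p9_parse_header(const uint8_t *buf, size_t len, struct p9_header *out)
{
    if (buf == NULL || out == NULL || len < P9_HEADER_SIZE)
        return -1;
    out->size = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
                ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    out->type = buf[4];
    out->tag  = (uint16_t)(buf[5] | (buf[6] << 8));
    return 0;
}

/* An R-message answers a T-message only when its type is T + 1. */
int p9_reply_matches(uint8_t t_type, uint8_t r_type)
{
    return (unsigned)r_type == (unsigned)t_type + 1u;
}
```

For example, an RREAD (117) is a valid answer to a TREAD (116) but not to a TWALK (110).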
At the message type layer, which follows the P9 protocol header, you can see the fields, which are of variable size, highlighted below:
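struct p9_rwalk {
    u16 nwqid;                            // number of qid entries that follow
    struct p9_qid wqids[P9_MAXWELEM];     // variable size: nwqid entries
};
struct p9_rread {
    u32 count;                            // length of data
    u8 *data;                             // variable size: count bytes
};
struct p9_rreaddir {
    u32 count;                            // length of data
    u8 *data;                             // variable size: count bytes
};
struct p9_rversion {
    u32 msize;                            // negotiated maximum message size
    struct p9_str version;                // variable size: see p9_str
};
struct p9_str {
    u16 len;                              // length of str
    char *str;                            // variable size: len bytes
};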
Based on the packet structure of the P9 protocol we need to hunt for message type confusion and memory corruption vulnerabilities such as out of bounds read/write.
So, what will a packet structure look like in memory? Figure 1 shows the protocol header and message type memory layout from WinDbg. The message size (msize) is negotiated to 0x10000 and the version string is “9P2000.W”.
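To make the byte layout concrete, here is a small sketch that serialises an Rversion packet with the values mentioned above (msize 0x10000, version “9P2000.W”). The helper names are hypothetical, and the tag value 0xFFFF used in the usage note is the 9P convention for version messages (NOTAG), which is an assumption about the captured packet rather than something read from Figure 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P9_RVERSION 101u

/* Store a 16-bit value in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical encoder: size[4] type[1] tag[2] msize[4] len[2] str[len].
 * Returns the number of bytes written, or 0 if the packet does not fit. */
size_t p9_encode_rversion(uint8_t *buf, size_t cap, uint16_t tag,
                          uint32_t msize, const char *version)
{
    size_t vlen = strlen(version);
    size_t total = 7 + 4 + 2 + vlen;

    if (vlen > 0xFFFF || total > cap)
        return 0;
    put_le32(buf, (uint32_t)total);      /* header: size */
    buf[4] = P9_RVERSION;                /* header: type */
    put_le16(buf + 5, tag);              /* header: tag  */
    put_le32(buf + 7, msize);            /* rversion.msize */
    put_le16(buf + 11, (uint16_t)vlen);  /* rversion.version.len */
    memcpy(buf + 13, version, vlen);     /* rversion.version.str (no NUL) */
    return total;
}
```

Calling p9_encode_rversion(buf, sizeof buf, 0xFFFF, 0x10000, "9P2000.W") produces a 21-byte packet: 7 bytes of header, 4 bytes of msize, a 2-byte length and the 8-character version string.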
Windows WSL P9 Communication Stack and Data Structures
The p9rdr.sys network mini-redirector driver registers the “\\Device\\P9Rdr” device with the Redirected Drive Buffering Subsystem (RDBSS) using the RxRegisterMinirdr API as part of the p9rdr DriverEntry routine. During this registration, the following P9 APIs or driver routines are exposed to the RDBSS:
P9NotImplemented
P9Start
P9Stop
P9DevFcbXXXControlFile
P9CreateSrvCall
P9CreateVNetRoot
P9ExtractNetRootName
P9FinalizeSrvCall
P9FinalizeVNetRoot
P9Create
P9CheckForCollapsibleOpen
P9CleanupFobx
P9CloseSrvOpen
P9ForceClosed
P9ExtendFile
P9Flush
P9QueryDirectoryInfo
P9QueryVolumeInfo
P9QueryFileInfo
P9SetFileInfo
P9IsValidDirectory
P9Read
P9Write
The p9rdr driver is not directly accessible from user mode using the DeviceIoControl API and all calls must go through the RDBSS.
As seen in Figure 2, when a user navigates to the WSL share at “\\wsl$” from Explorer, the RDBSS driver calls into the P9 driver through the previously registered APIs.
DIOD is a file server implementation that we modified to act as a “malicious” P9 server: it claims the “fsserver” socket name before the Windows OS does, in a form of squatting attack. Once we replaced the Microsoft P9 server with the DIOD server, we modified the “np_req_respond” function (explained in the fuzzing constraints section) so that we could control the P9 packets and send malicious responses to the Windows kernel. Our malicious P9 server and socket hijacking have been explained in detail here.
So now we know how data travels from Explorer to the P9 driver but how does the P9 driver communicate with the malicious P9 server? They communicate over AF_UNIX sockets.
There are two important data structures used for controlling data flow within the P9 driver called P9Client and P9Exchange.
The P9Client and P9Exchange data structures, when reverse engineered to the fields relevant to this research, look like the following (fields not relevant to this analysis have been labelled as UINT64 for alignment):
typedef struct P9Client {
    PVOID *WskTransport_vftable;
    PVOID *GlobalDevice;
    UINT64 RunRef;
    WskSocket *WskData;
    UINT64;
    UINT64;
    UINT_PTR;
    PVOID *MidExchangeMgr_vftable;
    PRDBSS_DEVICE_OBJECT *RDBSS;
    UINT64;
    PVOID **WskTransport_vftable;
    PVOID **MidExchangeMgr_vftable;
    P9Packet *P9PacketStart;
    UINT64 MaxConnectionSize;
    UINT64 Rmessage_size;
    P9Packet *P9PacketEnd;
    UINT_PTR;
    UINT64;
    UINT64;
    UINT_PTR;
    UINT64;
    UINT64;
    PVOID *Session_ReconnectCallback;
    PVOID **WskTransport_vftable;
    UINT64;
    UINT_PTR;
    UINT_PTR;
    UINT64;
    UINT_PTR;
    UINT64;
    UINT64;
    UINT64;
} P9Client;
P9Client data structure memory layout in WinDbg:
typedef struct P9Exchange {
    UINT64;
    UINT64;
    P9Client *P9Client;
    UINT64 Tmessage_type;
    UINT64;
    UINT_PTR;
    PVOID *Lambda_PTR1;
    PVOID *Lambda_PTR2;
    PRX_CONTEXT *RxContext;
    UINT64 Tmessage_size;
    UINT64;
    UINT64;
    UINT64;
    UINT64;
    UINT64;
    UINT64;
} P9Exchange;
The P9Exchange data structure layout in WinDbg:
To communicate with the P9 server, the P9 driver creates an I/O request packet (IRP) to receive data from the Winsock Kernel (WSK). An important point to note is that the Memory Descriptor List (MDL) used to hold the data passed between the P9 server and Windows kernel P9 client is 0x10000 bytes (the max connection size mentioned earlier).
virtual long WskTransport::Receive() {
    UINT64 MaxConnectionSize = 0x10000;
    P9_IRP_OBJECT = RxCeAllocateIrpWithMDL(2, 0, 0i64);
    P9_MDL = IoAllocateMdl(P9Client->P9PacketStart, MaxConnectionSize, 0, 0, 0i64);
    P9_IRP_OBJECT->IoStackLocation->Parameters->P9Client = &P9Client;
    P9_IRP_OBJECT->IoStackLocation->Parameters->DataPath = &P9Client::ReceiveCallback;
}
The MDL is mapped to the P9PacketStart field address within the P9Client data structure.
On IRP completion, the WskTransport::SendReceiveComplete completion routine is called to retrieve the P9Client structure from the IRP to process the P9 packet response from the server:
static int WskTransport::SendReceiveComplete(IRP *P9_IRP_OBJECT) {
    P9Client = &P9_IRP_OBJECT->IoStackLocation->Parameters->P9Client;
    P9Client::ReceiveCallback(P9Client);
}
The P9Client data structure is used within an IRP to receive the R-message data but what is the purpose of the P9Exchange data structure?
- When the P9 driver sends a T-message to the server, it must create an exchange so that it can track the state between the message type sent (T-message) and that returned by the server (R-Message).
- It contains lambda functions to execute on the specific message type. The Tmessage_type field within the P9Exchange data structure ensures that the server can only send R-messages to the same T-message type it received from the P9 driver.
- PRX_CONTEXT * RxContext structure is used to transfer data between Explorer and the p9rdr driver via the RDBSS driver.
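The role of the exchange can be illustrated with a small user-mode sketch: each outstanding request is recorded under its tag together with the T-message type that was sent, and a reply is accepted only if its type is that T-message type plus one. This is a simplified model under our own assumptions, not the driver's implementation; the names exchange, exchange_table, exchange_register and exchange_complete are hypothetical, and a fixed-size array stands in for the RDBSS MID mechanism.

```c
#include <stdint.h>

#define MAX_EXCHANGES 16

/* Hypothetical, simplified stand-in for P9Exchange. */
struct exchange {
    int      in_use;
    uint16_t tag;     /* stands in for the multiplex ID */
    uint8_t  t_type;  /* T-message type recorded when the request was sent */
};

struct exchange_table {
    struct exchange slots[MAX_EXCHANGES];
};

/* Record an outstanding request. Returns 0 on success, -1 if the table is full. */
int exchange_register(struct exchange_table *t, uint16_t tag, uint8_t t_type)
{
    for (int i = 0; i < MAX_EXCHANGES; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].in_use = 1;
            t->slots[i].tag = tag;
            t->slots[i].t_type = t_type;
            return 0;
        }
    }
    return -1;
}

/* Find and remove the exchange for a reply.
 * Returns 1 if the reply type matches the recorded request, 0 otherwise
 * (unknown tag or mismatched type). */
int exchange_complete(struct exchange_table *t, uint16_t tag, uint8_t r_type)
{
    for (int i = 0; i < MAX_EXCHANGES; i++) {
        struct exchange *e = &t->slots[i];
        if (e->in_use && e->tag == tag) {
            int ok = ((unsigned)r_type == (unsigned)e->t_type + 1u);
            e->in_use = 0;  /* the exchange is consumed either way */
            return ok;
        }
    }
    return 0;
}
```

In this model, a server that answers a TWALK (110) with an RREAD (117) is rejected because the type recorded in the exchange, not the type in the reply, decides how the reply is processed.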
The flow of a WALK T-message can be seen below:
Within the P9Client::CreateExchange function, the MidExchangeManager::RegisterExchange is responsible for registering the P9Exchange data structure with the RDBSS using a multiplex ID (MID) to distinguish between concurrent server and client requests.
MidExchangeManager::RegisterExchange(*P9Client, *P9Exchange) {
    NTSTATUS RxAssociateContextWithMid(PRX_MID_ATLAS P9Client->RDBSS, PVOID P9Exchange, PUSHORT NewMid);
}
The important fields within the P9Client and P9Exchange data structures which we will discuss further during the analysis:
- P9Client->MaxConnectionSize – set at the start of the connection and cannot be controlled by an attacker
- P9Client->P9PacketStart – points to the P9 packet received and can be fully controlled by an attacker
- P9Client->Rmessage_size – can be fully controlled by an attacker
- P9Exchange->Tmessage_type – set during T-message creation and cannot be controlled by an attacker
- P9Exchange->RxContext – used to pass data from P9 driver through the RDBSS to Explorer
Now that we know how the protocol works within the Windows kernel, the next stage is vulnerability hunting.
Windows Kernel P9 Server Vulnerability Hunting
P9 Packet Processing Logic
From a vulnerability perspective we want to audit the Windows kernel logic within p9rdr.sys, responsible for parsing traffic from the malicious P9 server. Figure 3 shows the source of the P9 packet and the sink, or where the packet processing completes within the p9rdr driver.
Now that we have identified the code for parsing the P9 protocol message types of interest we need to audit the code for message type confusion and memory corruption vulnerabilities such as out of bounds read/write and overflows.
Fuzzing constraints
There were a number of constraints which made deploying automated fuzzing logic difficult:
- The R-message type sent from the malicious P9 server must match the T-message type sent by the Windows kernel
- Timeouts in higher layers of the WSL stack
The above challenges could, however, be overcome but since the protocol is relatively simple we decided to focus on reversing the processing logic validation. To verify the processing logic validation, we created some manual fuzzing capability within the malicious P9 server to test the variable length packet field boundaries identified from the protocol overview.
Below is an example of a malicious RREAD R-message sent in response to a TREAD T-message, where we control the count and data variable-length fields.
srv.c
void np_req_respond(Npreq *req, Npfcall *rc)
{
    NP_ASSERT(rc != NULL);
    xpthread_mutex_lock(&req->lock);

    u32 count = 0xFFFFFFFF;       /* value chosen for the rread count field */
    size_t data_len = 0xFFF0;     /* bytes of payload actually allocated */
    Npfcall *fake_rc;
    u8 *data = malloc(data_len);
    memset(data, 'A', data_len);

    if (!(fake_rc = np_alloc_rread1(count))) {
        free(data);
        xpthread_mutex_unlock(&req->lock);   /* do not return with the lock held */
        return;
    }
    if (fake_rc->u.rread.data)
        memmove(fake_rc->u.rread.data, data, data_len);   /* copy only what was allocated */
    free(data);

    if (rc->type == 0x75) {       /* 0x75 = 117 = P9_RREAD */
        fprintf(stderr, "RREAD Packet Reply");
        req->rcall = fake_rc;
    } else {
        req->rcall = rc;
    }
    if (req->state == REQ_NORMAL) {
        np_set_tag(req->rcall, req->tag);
        np_conn_respond(req);
    }
    xpthread_mutex_unlock(&req->lock);
}
Validation Checks
The data passed to the P9 driver is contained within a connection memory allocation of 0x10000 bytes (P9Client->P9PacketStart) and most of the processing is done within this memory allocation, with two exceptions where memmove is called within the P9Client::FillData and P9Client::Lambda_2275 functions (discussed below).
A message-type confusion attack is not possible since the P9Exchange data structure tracks the R-message to its corresponding T-message type.
In addition, the P9 driver uses a span reader to process message type fields of static length. The P9Exchange structure stores the message type which is used to determine the number of fields within a message during processing.
While we can control the P9 packet size we cannot control the P9Client->MaxConnectionSize which means messages greater than or equal to 0x10000 will be dropped.
All variable-size fields within the message type layer of the protocol are checked against the P9Packet size field, ensuring that a malicious field will not result in an out-of-bounds read or write outside the 0x10000-byte connection memory allocation.
The processing logic functions identified previously were reverse engineered to understand the validation on the protocol’s fields, with specific focus on the variable length fields within message types rversion, rwalk and rread.
By importing the P9Client and P9Exchange data structures into IDA Pro, the reverse engineering process was relatively straightforward and the packet validation logic was easy to follow. The functions below have been reversed to the level required for understanding the validation and are not representative of the entire function code base.
P9Client::ReceiveCallback validates that the Rmessage_size does not exceed the max connection size of 0x10000.
void P9Client::ReceiveCallback(P9Client *P9Client) {
    struct p9packet *P9Packet;
    uint64 MaxConnectionSize;
    uint64 Rmessage_size;
    MaxConnectionSize = P9Client->MaxConnectionSize;
    Rmessage_size = P9Client->Rmessage_size;
    if (MaxConnectionSize) {
        P9Packet = (struct p9packet *)P9Client->P9PacketStart;
        if (MaxConnectionSize < 0 || !P9Packet)
            terminate(P9Packet);
    }
    if (Rmessage_size >= 0 && P9Client->MaxConnectionSize >= Rmessage_size) {
        P9Client::HandleReply(*P9Client);
    } else {
        terminate(P9Packet);
    }
}
P9Client::HandleReply – there are multiple local DoS conditions here which result in a Blue Screen of Death (BSOD), depending on the values of P9Client->Rmessage_size and P9Client->P9PacketEnd->size; e.g. when P9Client->P9PacketEnd->size is zero, terminate() is called, which triggers a BSOD.
void P9Client::HandleReply(P9Client *P9Client) {
    uint64 P9PacketHeaderSize = 7;
    uint64 Rmessage_size = P9Client->Rmessage_size;
    if (Rmessage_size >= 7) {
        P9PacketEnd = P9Client->P9PacketEnd;
        if (!P9PacketEnd)
            break;
        uint64 P9PacketSize = P9Client->P9PacketEnd->size;
        if (Rmessage_size < P9PacketSize)
            P9Client::FillData();
        if (Rmessage_size < 4)
            terminate();    // checking a P9 header size field exists in packet
        if (Rmessage_size < 5)
            fastfail();     // checking a P9 header type field exists in packet
        int message_type = P9PacketEnd->type;
        if (Rmessage_size < 7)
            fastfail();     // checking a P9 header tag field exists in packet
        uint64 tag = P9PacketEnd->tag;
        uint64 P9message_size = P9PacketSize - P9PacketHeaderSize;   // getting size of message
        if (Rmessage_size - 7 < 0)
            terminate();    // checking message layer exists after P9 header
        if (Rmessage_size - 7 < P9message_size)
            terminate();    // BSOD here: when P9PacketSize = 0, subtracting 7 wraps around so
                            // P9message_size becomes greater than Rmessage_size
        P9Client::ProcessReply(P9Client, message_type, tag, &P9message_size);
    } else {
        P9Client::FillData();
    }
}
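The wraparound noted in the comment above can be reproduced in isolation. The sketch below is our own illustration of the arithmetic, not code from p9rdr.sys: the first function performs the same unsigned subtraction without a lower-bound check, and the second shows one way a parser could reject such a packet with an error instead. Both function names are hypothetical.

```c
#include <stdint.h>

#define P9_HEADER_SIZE 7u

/* Same arithmetic as the pseudocode: the header length is subtracted from
 * the attacker-supplied size field with no lower-bound check, so a size
 * below 7 wraps around to a very large unsigned value. */
uint64_t p9_body_size_unchecked(uint32_t packet_size)
{
    return (uint64_t)packet_size - P9_HEADER_SIZE;
}

/* Illustrative alternative: validate before subtracting.
 * received is the number of bytes actually read from the socket.
 * Returns 0 and stores the body size on success, -1 for a malformed
 * or incomplete packet. */
int p9_body_size_checked(uint32_t packet_size, uint64_t received, uint64_t *body)
{
    if (packet_size < P9_HEADER_SIZE)
        return -1;              /* subtraction would wrap */
    if (received < packet_size)
        return -1;              /* packet not fully received yet */
    *body = (uint64_t)packet_size - P9_HEADER_SIZE;
    return 0;
}
```

With a size field of zero, the unchecked subtraction yields 0xFFFFFFFFFFFFFFF9, which is larger than any possible Rmessage_size - 7 and therefore reaches the terminate() call in the driver's fail flow.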
P9Client::FillData – we cannot reach this function with a large Rmessage_size to force an out of bounds write.
int P9Client::FillData(P9Client *P9Client) {
    uint64 Rmessage_size = P9Client->Rmessage_size;
    uint_ptr P9PacketEnd = P9Client->P9PacketEnd;
    uint_ptr P9PacketStart = P9Client->P9PacketStart;
    if (P9PacketEnd != P9PacketStart) {
        memmove(P9PacketStart, P9PacketEnd, Rmessage_size);
    }
}
ProcessReply checks the R-message type with that from the T-message within the P9Exchange data structure.
void P9Client::ProcessReply(P9Client *P9Client, Rmessage_type, tag, &P9message_size) {
    P9Exchange *P9Exchange = MidExchangeManager::FindAndRemove(*P9Client, &P9Exchange);
    if (P9Packet->tag > 0) {
        int rmessage_type;
        int message_type_size = GetMessageSize(P9Exchange->Tmessage_type);
        if (P9message_size >= message_type_size) {
            rmessage_type = P9Exchange->Tmessage_type;   // T-message type recorded in the exchange
            rmessage_type = rmessage_type + 1;           // expected R-message type
        }
        if (rmessage_type > 72) {
            switch (P9Exchange->Tmessage_type) {
            case 100:
                P9Client::ProcessVersionReply(P9Client, P9Exchange, &P9message_size);
                break;
            case 110:
                P9Client::ProcessWalkReply(rmessage_type, P9Exchange, &P9message_size);
                break;
            }
        } else {
            P9Client::ProcessReadReply(rmessage_type, P9Exchange, &P9message_size);
        }
    }
}
The P9Client::ProcessReply function calls MidExchangeManager::FindAndRemove to fetch the P9Exchange data structure associated with the R-message's corresponding T-message.
MidExchangeManager::FindAndRemove(*P9Client, &P9Exchange) {
    NTSTATUS RxMapAndDissociateMidFromContext(PRX_MID_ATLAS P9Client->RDBSS_RxContext, USHORT Mid, &P9Exchange);
}
ProcessVersionReply compares the returned version with the version string sent by the client (“9P2000.L”, which is 8 characters) and checks that the returned string has the same length, so the rversionlen does not affect the tryString function.
void P9Client::ProcessVersionReply(*P9Client, *P9Exchange, &P9message_size) {
    char *rversion;
    rversion = P9Client->P9PacketStart.u.rversion->version->str;
    rversionlen = P9Client->P9PacketStart.u.rversion->version->len;
    tryString(messagesize, &rversion);
    strcmp(Tversion, Rversion);
}
ProcessWalkReply checks that the total number of rwalk structures does not exceed the P9message_size.
void P9Client::ProcessWalkReply(rmessage_type, *P9Exchange, &P9message_size) {
    uint16 nwqid = p9packet.rwalk.nwqid;
    uint64 rwalk_message_size = &P9message_size - 2;   // 2 bytes of rwalk header for nwqid field
    uint_ptr rwalkpacketstart = &P9Client->P9PacketStart.u.rwalk->wqids;
    if (rwalk_message_size <= P9message_size) {
        // Lambda_8972 is Lambda_PTR1 for rwalk message type
        P9Exchange->Lambda_8972(int, nwqid, &rwalk_message_size, P9Exchange->RxContext, &rwalkpacketstart);
    } else {
        // SyncContextErrorCallback is Lambda_PTR2 for rwalk message type
        P9Exchange->P9Client::SyncContextErrorCallback(error_code, P9Exchange->RxContext);
    }
}
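A bounds check of this kind can be sketched as follows. The sketch assumes the standard 9P wire layout, in which each qid occupies 13 bytes (type[1] version[4] path[8]) and a walk carries at most 16 elements; whether the driver enforces the 16-element limit is not something we verified, and the function name p9_rwalk_validate is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define P9_QID_SIZE 13u   /* type[1] version[4] path[8] on the wire */
#define P9_MAXWELEM 16u   /* maximum walk elements in 9P (assumed limit) */

/* body points at the message layer directly after the 7-byte header and
 * body_len is the number of bytes it contains.
 * Returns 0 and stores nwqid if every qid lies inside the body, else -1. */
int p9_rwalk_validate(const uint8_t *body, size_t body_len, uint16_t *nwqid_out)
{
    uint16_t nwqid;

    if (body == NULL || nwqid_out == NULL || body_len < 2)
        return -1;                       /* no room for the nwqid field */
    nwqid = (uint16_t)(body[0] | (body[1] << 8));
    if (nwqid > P9_MAXWELEM)
        return -1;                       /* more elements than a walk allows */
    if ((size_t)nwqid * P9_QID_SIZE > body_len - 2)
        return -1;                       /* qid array would run past the body */
    *nwqid_out = nwqid;
    return 0;
}
```

A reply that claims two qids needs at least 2 + 2 * 13 = 28 bytes of body; anything shorter is rejected before the qid array is touched.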
ProcessReadReply checks the size of the count field does not exceed 0x8000 and writes it into an MDL within P9Exchange-> RxContext to pass back up the RDBSS stack to view file contents within Explorer.
void P9Client::ProcessReadReply(rmessage_type, *P9Exchange, &P9message_size) {
    uint64 count = P9Client->P9PacketStart.u.rread->count;
    P9Exchange->Lambda_2275(count, P9Exchange->RxContext, &P9message_size);
}
Lambda_2275(count, P9Exchange->RxContext, &P9message_size) {
    uint64 maxsize = P9Exchange->RxContext + offset;   // max_size = 0x8000
    uint64 MDL = P9Exchange->RxContext + offset;
    if (count > maxsize)
        terminate();
    memmove(&MDL, P9Client->P9PacketStart.u.rread->data, count);
}
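The count validation can be modelled in user mode as below. This is a sketch under our own assumptions rather than the driver's logic: it checks the attacker-controlled count against both the bytes actually present in the message and the destination capacity, and it returns an error where the driver calls terminate(). The function name p9_rread_copy is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* body points at the rread message layer: count[4] followed by data.
 * Copies the payload into dst and returns the number of bytes copied,
 * or -1 if count is inconsistent with the message or the destination. */
long p9_rread_copy(const uint8_t *body, size_t body_len,
                   uint8_t *dst, size_t dst_cap)
{
    uint32_t count;

    if (body == NULL || dst == NULL || body_len < 4)
        return -1;                   /* no room for the count field */
    count = get_le32(body);
    if (count > body_len - 4)
        return -1;                   /* claims more data than was received */
    if (count > dst_cap)
        return -1;                   /* would overflow the destination */
    memcpy(dst, body + 4, count);
    return (long)count;
}
```

With dst_cap set to 0x8000, a reply carrying count = 0xFFFFFFFF, as produced by the modified np_req_respond, fails the first comparison and is never copied.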
Conclusion
Through this research, we discovered a local Denial of Service (DoS) within the Windows kernel implementation of the P9 protocol. As explained, the vulnerability cannot be exploited to gain code execution within the Windows kernel so there is no risk to users from this specific vulnerability. As a pre-requisite to malicious P9 server attacks, an attacker must hijack the P9 server socket “fsserver”. Therefore, we can mitigate this attack by detecting and preventing hijacking of the socket “fsserver”. McAfee MVISION Endpoint and EDR can detect and prevent P9 server socket “fsserver” hijacking, which you can read more about here.
We hope this research provides insights into the following:
- The vulnerability hunting process for new features such as the WSL P9 protocol on the Windows 10 OS
- Support for future research higher up the WSL communications stack, which increases in complexity due to the implementation of a virtual Linux file system on Windows
- The value of McAfee Advanced Threat Research (ATR) working closely with our product and innovation teams to provide protection for our customers
Finally, a special thanks to Leandro Costantino and Cedric Cochin for their initial Windows 10 WSL P9 server research.